ABSTRACT


The Dempster-Shafer (D-S) Theory of Evidential Reasoning may be useful in handling issues associated
with theater ballistic missile discrimination. This paper highlights the Dempster-Shafer theory and describes how 
this technique was implemented and applied to data collected by two infrared sensors on a recent flight test. 
Results from both classifier and feature level fusion applications are presented in this paper. 
1.0 INTRODUCTION


The purpose of discrimination is to identify the lethal target in the presence of a potentially cluttered 
environment that may include debris, spent boosters or booster segments, and even decoys. There are many problems 
associated with performing the discrimination function including limited a priori data for training and testing, sensor 
measurement error, short observation times, and feature selection. It is possible, or even likely, that
data collected by an observational system will deviate significantly from the preflight predictions used to train and 
test the algorithm. Sensor measurements may also be sparse, derived features can be conflicting, and predictions 
can be shrouded with uncertainty. Classically, the discrimination function is performed using statistical methods 
derived from Bayes’ Theorem. While Bayesian methods are theoretically optimal under certain conditions, such 
applications require estimates of prior probabilities and decisions are made from disjoint class hypothesis sets. The 
Dempster-Shafer (D-S) theory of evidence potentially provides an alternative approach for object classification 
when conditions are not optimal. D-S theory allows for any union of the single class hypotheses, which could 
prove to better utilize limited data by not forcing a particular classification decision to a single class hypothesis if 
the data is ambiguous. D-S theory also incorporates a formal notation for uncertainty, which results in upper and 
lower bounds on the probability of class membership. This could enable the decision-maker to conclude that there 
is insufficient evidence to make a confident decision. Also, the computational heart of the theory, Dempster’s 
Combination Rule, operates when data is missing or conflicting and has no strict requirements put upon the origin 
of the probability masses input for combination. However, before a D-S based algorithm can be implemented, 
issues associated with mass assignment, hypothesis definition, uncertainty quantification, and decision rules must 
be resolved. 

The remainder of this paper is organized as follows. Section 2 describes D-S theory, its origins, and its
applications, and highlights potential benefits to the ballistic missile defense (BMD) community. Section 3 reviews Bayes’
Theorem and a traditional classifier. Section 4 outlines the methods used in applying D-S theory to the
BMD discrimination task. Results and conclusions are presented in Sections 5 and 6, respectively.


2.0 DEMPSTER-SHAFER THEORY ORIGINS AND APPLICATIONS 


In the late 1960s, Arthur P. Dempster was working with the ideas of statistical inference. By 1968,
Dempster had published several papers dealing with generalizing traditional Bayesian methods. Dempster’s
ideas were expanded upon in 1976 when Glenn Shafer published the book A Mathematical Theory of Evidence.
The intention of both men was to develop a method for representing uncertainty that provides more flexibility than
the methods available in classical Bayesian approaches.


Since the ideas presented by Arthur Dempster and Glenn Shafer were introduced, they have been applied 
in a wide variety of fields, including defense, environmental research, and medical diagnosis. For example, D-S
theory has been utilized as a decision aid for the intelligence analyst on the battlefield to speed the force
commander’s command and control cycle. Similarly, sensor fusion in the context of naval warfare has been
researched to provide commanders knowledge about the location and identity of enemy warcraft. A D-S algorithm
has also been implemented to identify land cover in unsupervised multi-source remote sensing. This paper
expands on these applications by presenting an approach that was utilized to develop, implement, and test a D-S 
technique for ballistic missile defense target discrimination applications. 


2.1 DEMPSTER-SHAFER THEORY OVERVIEW 

Let $\Theta$ denote a finite set of possible hypotheses. This is known as the frame of discernment. For BMD
discrimination applications, $\Theta$ represents a list of object types that may be observed by a seeker during an
engagement scenario. These object types could include re-entry vehicles (RVs), fragments, and decoy classes.
Dempster-Shafer theory allows consideration not only of singleton hypotheses such as {RV}, {fragment}, and
{decoy} but also of hypotheses that contain multiple classes, such as {RV, decoy}. The set of all possible hypotheses
is the power set, denoted $2^{\Theta}$, which contains $2^{|\Theta|}$ elements, where $|\Theta|$ is the cardinality of $\Theta$. The power set consists of all
combinations of the singleton hypotheses as well as the empty set, $\emptyset$, and the full set, $\Theta$. Henceforth, any hypothesis
containing a single class will be referred to as a “singleton hypothesis” and any subset of two or more classes in $\Theta$ will be
referred to as a “mixed class hypothesis”.


Bodies of evidence are quantified using a mass function, m. The mass function is defined for every
element of $2^{\Theta}$ and must satisfy three properties. For $A \in 2^{\Theta}$,

$$m(A) \in [0, 1] \tag{1a}$$

$$m(\emptyset) = 0, \text{ for } \emptyset \text{ the empty set} \tag{1b}$$

$$\sum_{A \in 2^{\Theta}} m(A) = 1 \tag{1c}$$

Like probabilities, mass values are between 0 and 1, (1a), and the mass associated with a non-event is zero,
(1b). The total chance associated with the event being tested is 1, as is evident in (1c).
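The three properties in (1) can be checked mechanically. The following is a minimal sketch, assuming a mass function is stored as a Python dictionary keyed by frozensets of class labels. The names `power_set` and `is_valid_mass` and the numerical tolerance are illustrative assumptions, not part of the algorithm tested in this paper.

```python
from itertools import combinations


def power_set(frame):
    """Return every subset of the frame of discernment, including the empty set."""
    items = sorted(frame)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_valid_mass(m, frame, tol=1e-9):
    """Check properties (1a)-(1c) for a mass function m: dict[frozenset, float]."""
    frame = frozenset(frame)
    # Every hypothesis that carries mass must be a subset of the frame.
    if any(not a <= frame for a in m):
        return False
    # (1a): each mass lies in [0, 1].
    if any(v < 0.0 or v > 1.0 for v in m.values()):
        return False
    # (1b): the empty set carries no mass.
    if m.get(frozenset(), 0.0) != 0.0:
        return False
    # (1c): masses sum to one (subsets not listed in m have mass zero).
    return abs(sum(m.values()) - 1.0) <= tol


# Hypothetical example values, chosen only for illustration.
frame = {"RV", "fragment", "decoy"}
m = {frozenset({"RV"}): 0.5,
     frozenset({"RV", "decoy"}): 0.3,
     frozenset(frame): 0.2}

print(len(power_set(frame)))    # 8 subsets for a three-class frame
print(is_valid_mass(m, frame))  # True
```

Storing only the hypotheses that carry non-zero mass keeps the representation small even though the power set grows exponentially with the number of classes.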

Mass information is combined using Dempster’s Rule of Combination. Consider as an example combining
masses from two probabilistically independent sources, sensor 1 and sensor 2. Let $m_1$ denote mass derived from
sensor 1 and $m_2$ denote mass derived from sensor 2. Let $B_i$ and $C_j$ denote any subsets of hypotheses in $\Theta$. The
combined mass for a non-empty hypothesis A from both sensors, $m_{12}(A)$, is defined by

$$m_{12}(A) = \frac{\sum_{B_i \cap C_j = A} m_1(B_i)\, m_2(C_j)}{1 - K} \tag{2}$$

$$\text{where } K = \sum_{B_i \cap C_j = \emptyset} m_1(B_i)\, m_2(C_j) \tag{3}$$


Equation (2) presents Dempster’s Rule of Combination for two sources. It is simply the sum of the mass products of all
hypothesis pairs whose intersection is hypothesis A, normalized by 1 minus K, where K, representing the conflict
in the data, is the sum of the mass products of all hypothesis pairs whose intersection is the empty set. This is shown in equation (3). As the
data becomes more contradictory, i.e., the masses more strongly support contradictory hypotheses, K approaches 1
and a combination of the data makes less sense. In fact, when K reaches 1, the denominator of (2) vanishes and a combination of the data is impossible.
Dempster’s Rule of Combination can be generalized for multiple sources and is commutative and associative.
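Equations (2) and (3) translate directly into a double loop over the hypotheses of the two sources. The sketch below is illustrative only: it assumes masses are held in dictionaries keyed by frozensets, and the function name `combine`, the tolerance, and the example mass values are hypothetical.

```python
def combine(m1, m2, tol=1e-12):
    """Dempster's Rule of Combination for two mass functions.

    Each mass function maps frozenset hypotheses to mass.
    Returns (combined masses, K), where K is the conflict of equation (3).
    """
    products = {}
    conflict = 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            intersection = b & c
            if intersection:
                # Numerator of (2): pairs whose intersection is the hypothesis A.
                products[intersection] = products.get(intersection, 0.0) + mass_b * mass_c
            else:
                # Equation (3): pairs whose intersection is the empty set.
                conflict += mass_b * mass_c
    if 1.0 - conflict < tol:
        raise ValueError("K = 1: the sources are in total conflict")
    return {a: v / (1.0 - conflict) for a, v in products.items()}, conflict


# Hypothetical masses from two sensors over a three-class frame.
theta = frozenset({"RV", "decoy", "fragment"})
m1 = {frozenset({"RV"}): 0.6, theta: 0.4}
m2 = {frozenset({"decoy"}): 0.5, frozenset({"RV", "decoy"}): 0.3, theta: 0.2}

m12, k = combine(m1, m2)
print(round(k, 3))  # 0.3
for hypothesis, mass in sorted(m12.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 3))
```

Here the only conflicting pair is {RV} from sensor 1 against {decoy} from sensor 2, so K = 0.6 x 0.5 = 0.3 and every surviving product is divided by 0.7.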


It is important to understand some of the basic differences between traditional Bayesian theory and D-S 
theory. Traditional Bayesian statistics tell us that when an event “A” is expected to occur 20% of the time, the 
remaining 80% is assigned to the hypothesis that event “A” will not occur. In this way, the uncertainty about the 
event is established although any imprecision concerning the occurrence or non-occurrence of the event is not considered. 
In Dempster-Shafer theory, uncertainty is represented through two distinct but related values known as belief (Bel) and
plausibility (Pls). Belief represents that which is known based on the information given, while plausibility represents that
which cannot be disproved from the given information. Belief and plausibility functions
are derived directly from the mass function m by 


$$\mathrm{Bel}(A) = \sum_{B \subseteq A} m(B) \tag{4a}$$

$$\mathrm{Pls}(A) = \sum_{B \cap A \neq \emptyset} m(B) \tag{4b}$$


From (4a) it is seen that the belief associated with hypothesis A is simply the sum of the masses of all hypotheses
that are subsets of A. Plausibility associated with hypothesis A, as stated in (4b), is the sum of the masses of all
hypotheses whose intersection with A is non-empty. The relationship between belief and plausibility is seen in
equation (5), where $\bar{A}$ is the set theoretic complement of A such that $A \cup \bar{A} = \Theta$ and $A \cap \bar{A} = \emptyset$.

$$\mathrm{Pls}(A) = 1 - \mathrm{Bel}(\bar{A}) \tag{5}$$

Like mass values, belief and plausibility values are between zero and one, where belief is the smaller of 
the two values. Hence, belief and plausibility can be thought of as lower and upper probabilities respectively. The 
difference in these two values is a measurement of the imprecision of the uncertainty. As additional data is 
collected and processed, the belief and plausibility values begin to converge thus resulting in a decrease in 
imprecision. 
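Belief and plausibility can be computed from a mass function with two short sums. The sketch below is a minimal illustration under the assumption that masses are stored in a dictionary keyed by frozensets; the function names and the example masses are hypothetical.

```python
def belief(m, a):
    """Equation (4a): sum of the masses of all non-empty subsets of a."""
    return sum(v for b, v in m.items() if b and b <= a)


def plausibility(m, a):
    """Equation (4b): sum of the masses of all hypotheses that intersect a."""
    return sum(v for b, v in m.items() if b & a)


# Hypothetical masses over a three-class frame.
theta = frozenset({"RV", "decoy", "fragment"})
m = {frozenset({"RV"}): 0.5, frozenset({"RV", "decoy"}): 0.3, theta: 0.2}

for label in ("RV", "decoy", "fragment"):
    a = frozenset({label})
    bel, pls = belief(m, a), plausibility(m, a)
    # Equation (5): Pls(A) = 1 - Bel(complement of A).
    assert abs(pls - (1.0 - belief(m, theta - a))) < 1e-9
    print(label, round(bel, 3), round(pls, 3))
```

For these example masses the interval for {RV} is [0.5, 1.0], while {fragment} has zero belief and a plausibility of only 0.2, which shows how the width of the interval reflects the mass that has not been committed to any single class.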


2.2 POTENTIAL BENEFITS TO BALLISTIC MISSILE DEFENSE 


The Dempster-Shafer theory of evidential reasoning was intended to be a generalized form of Bayes
theory. In traditional Bayesian thinking, if, based on some body of evidence, a hypothesis A is given a 20%
chance of occurring, it is automatically assumed that there is an 80% chance of the event not occurring.
Dempster-Shafer philosophy would argue that the remaining 80% chance should not be assigned to the negation
of A but to the entire hypothesis set, $\Theta$, because there is no evidence to indicate which hypothesis this remaining
80% chance supports. This explicit notation for assigning uncertainty about an event is one of the primary
differences between D-S theory and traditional approaches for hypothesis testing. In fact, studies have shown that
in the absence of this uncertainty measure, D-S and Bayesian results are identical. The uncertainty associated
with the event also influences the difference between belief and plausibility values. This difference could be used
to measure when imprecision related to the uncertainty has decreased enough to make a confident decision.
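As a worked illustration of the 20% example, the two assignments can be compared using the definitions of belief and plausibility from Section 2.1. The only assumption made here is that A is a proper, non-empty subset of the frame of discernment.

```latex
\begin{align*}
\text{D-S assignment:} \quad & m(A) = 0.2,\; m(\Theta) = 0.8 \\
& \mathrm{Bel}(A) = m(A) = 0.2 \\
& \mathrm{Pls}(A) = m(A) + m(\Theta) = 1.0 \\[4pt]
\text{Bayesian assignment:} \quad & m(A) = 0.2,\; m(\bar{A}) = 0.8 \\
& \mathrm{Bel}(A) = m(A) = 0.2 \\
& \mathrm{Pls}(A) = 1 - \mathrm{Bel}(\bar{A}) = 0.2
\end{align*}
```

In the D-S assignment the interval between belief and plausibility is 0.8 wide, reflecting the uncommitted mass. In the Bayesian assignment the interval collapses to a single value.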


Mixed class hypotheses can provide more flexibility in assigning hypothesis support from imprecise data.
For example, suppose an object is estimated to have a certain emissive area. Based on current a priori knowledge of this
example scenario, it is expected that a booster segment would have an emissive area that is far greater than the
calculated value. Any hypothesis containing the “booster” element will therefore be eliminated from the list of
possible classifications. However, it remains to be determined whether this object is an RV or a decoy. If the available
evidence supports both hypotheses such that a confident classification cannot be made, the mass associated with
the emissive area feature can be assigned to the {RV, decoy} hypothesis, indicating lack of support for the booster
hypothesis while supporting the possibility that the object is either an RV or a decoy.


Finally, another potential benefit to missile defense applications is that the Combination Rule sets no strict 
requirements upon the origin of the input masses. Any information source can be utilized for fusion as long as the 
properties of (1) are met. This could provide defense systems the flexibility to incorporate all information sources 
available regardless of the logic required to generate masses from these sources. 


3.0 BAYES THEOREM AND THE QUADRATIC CLASSIFIER 


Classical discrimination is based upon Bayesian methods. A Bayesian quadratic classifier was used as a 
basis for comparison to the D-S algorithm tested in this analysis. The remainder of this section reviews basic 
hypothesis testing for two classes using Bayes’ Theorem and the quadratic classifier employed in this analysis. For 
a more complete explanation of this theory and its application, refer to Fukunaga’s Introduction to Statistical Pattern
Recognition, Second Edition.

For a two class problem, Bayes’ Theorem for obtaining the posterior probability of belonging to class $\omega_i$
given an unknown sample X is defined as

$$P(\omega_i \mid X) = \frac{p(X \mid \omega_i)\, P(\omega_i)}{\sum_{j=1}^{2} p(X \mid \omega_j)\, P(\omega_j)} \tag{6}$$

for $P(\omega_i)$ = prior probability of $\omega_i$ and $p(X \mid \omega_i)$ = class conditional density of X given $\omega_i$.
A simple decision rule can be defined as

$$P(\omega_1 \mid X) > P(\omega_2 \mid X) \rightarrow X \in \omega_1, \qquad P(\omega_1 \mid X) < P(\omega_2 \mid X) \rightarrow X \in \omega_2 \tag{7}$$

Simply stated, if the posterior probability from (6) for class $\omega_1$ is greater than the posterior probability of class $\omega_2$
given an unknown sample X, then assign X to class $\omega_1$. The same logic applies for classifying samples to class
$\omega_2$. Equation (7) is called the Bayes test for minimum error. Other forms of (7) are derived by Fukunaga.
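Equations (6) and (7) amount to a normalization followed by a comparison. The following sketch is illustrative only; the function names are hypothetical, the likelihood and prior values are made up, and a tie is arbitrarily resolved in favor of class 2.

```python
def posteriors(likelihoods, priors):
    """Equation (6): posterior probabilities from class conditional densities and priors."""
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("sample has zero density under every class")
    return [j / total for j in joint]


def bayes_decision(likelihoods, priors):
    """Equation (7): return 1 or 2 according to the larger posterior (ties go to 2)."""
    post = posteriors(likelihoods, priors)
    return 1 if post[0] > post[1] else 2


# Hypothetical densities p(X | class 1) and p(X | class 2) with equal priors.
post = posteriors([0.05, 0.20], [0.5, 0.5])
print([round(p, 3) for p in post])               # [0.2, 0.8]
print(bayes_decision([0.05, 0.20], [0.5, 0.5]))  # 2
```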

For a quadratic classifier, Bayes’ decision rule takes the form of a boundary separating the two class
distributions. This quadratic boundary is derived from the Gaussian class conditional density function

$$p(X \mid M_i, \Sigma_i) = (2\pi)^{-n/2}\, |\Sigma_i|^{-1/2} \exp\left\{ -\tfrac{1}{2} (X - M_i)^T \Sigma_i^{-1} (X - M_i) \right\} \tag{8}$$

where $M_i$ and $\Sigma_i$ are the mean vector and covariance matrix of class $\omega_i$ and n is the dimension of X.
As depicted in Figure 1, once the classifier is designed and the boundary, or threshold, is set, any sample falling
on the class $\omega_1$ side of the threshold is classified as belonging to class $\omega_1$ and any sample falling on the class $\omega_2$ side
of the threshold is classified as belonging to class $\omega_2$.



Figure 1. Quadratic Classifier in Two Dimensional Feature Space 
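A two dimensional version of the quadratic classifier can be sketched by evaluating (8) for each class and applying the decision rule (7). The sketch below uses only the standard library and writes out the 2x2 matrix inverse explicitly. The class parameters, priors, and function names are hypothetical and are not the values used in this analysis.

```python
import math


def gaussian_density_2d(x, mean, cov):
    """Equation (8) for n = 2: Gaussian class conditional density p(X | M, Sigma)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    dx0, dx1 = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (X - M)^T Sigma^-1 (X - M) using the 2x2 inverse
    # Sigma^-1 = (1 / det) * [[d, -b], [-c, a]].
    quad = (dx0 * (d * dx0 - b * dx1) + dx1 * (-c * dx0 + a * dx1)) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def classify(x, params, priors):
    """Assign x to the class with the largest prior-weighted density."""
    scores = {label: priors[label] * gaussian_density_2d(x, mean, cov)
              for label, (mean, cov) in params.items()}
    return max(scores, key=scores.get)


# Hypothetical class parameters (mean vector, covariance matrix).
params = {
    "class 1": ((0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    "class 2": ((3.0, 3.0), ((4.0, 0.0), (0.0, 4.0))),
}
priors = {"class 1": 0.5, "class 2": 0.5}

print(classify((0.5, 0.5), params, priors))  # class 1
print(classify((3.0, 3.0), params, priors))  # class 2
```

Because the two covariance matrices differ, the set of points where the two scores are equal is a quadratic curve rather than a straight line, which is what gives the classifier its name.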


4.0 FUSION USING DEMPSTER-SHAFER THEORY 


Fusion using a Dempster-Shafer theory based algorithm is outlined in Figure 2. Evidence, in the form of 
features or classifier output, is translated into masses using the mass function m that is defined for a given 
situation. The masses, including any mixed class hypotheses, are then fused using Dempster’s Rule of 
Combination. At this time, evidence from various sources can be weighted by assigning more mass to the full
$\Theta$ hypothesis, thereby representing added uncertainty about the data. This uncertainty measure is problem specific
and could be based on sensor reliability or feature robustness. After the fusion process is completed, beliefs and 
plausibilities are derived. The final decision logic may then be based on any of several criteria including belief, 
plausibility, or the average of the two. While fusing masses with Dempster’s Rule of Combination and deriving 
beliefs and plausibilities lie at the heart of D-S theory, translating evidence sources into masses, defining “mixed 
class” hypothesis criteria, and identifying the final decision logic remain largely application dependent. 
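One common way to weight a source by moving mass to the full set is the classical discounting operation, in which every mass is scaled by a reliability factor and the remainder is assigned to the full hypothesis. The paper does not state which weighting rule was used, so the sketch below is an assumption offered for illustration; the function name and the reliability value are hypothetical.

```python
def discount(m, theta, reliability):
    """Scale each mass by `reliability` and assign the remainder to the full set theta."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    out = {a: reliability * v for a, v in m.items() if a != theta}
    # Mass already on theta is kept; the removed mass is added to it.
    out[theta] = 1.0 - reliability + reliability * m.get(theta, 0.0)
    return out


# Hypothetical masses from a sensor judged to be 60% reliable.
theta = frozenset({"RV", "decoy"})
m = {frozenset({"RV"}): 0.7, frozenset({"decoy"}): 0.3}

weighted = discount(m, theta, reliability=0.6)
for hypothesis, mass in weighted.items():
    print(sorted(hypothesis), round(mass, 2))
```

A reliability of 1 leaves the masses unchanged, while a reliability of 0 places all mass on the full set, which makes the source neutral under Dempster’s Rule of Combination.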


In this analysis both feature level fusion and classifier level fusion were performed using flight test data. 
Section 4.1 highlights the data set employed by this study. Feature level fusion is explained in Section 4.2 and 
classifier level fusion is described in Section 4.3. 
Figure 2. Fusion Algorithm Using Dempster-Shafer Theory


4.1 DATA UTILIZED 


The test data utilized to exercise and test the D-S theory was from a recent flight test where several objects 
were deployed during approximately the same time period. The presence of multiple objects allowed for the 
opportunity to collect data to support a discrimination analysis using the D-S algorithm. Feature values for each 
object were derived from two infrared sensors each collecting data in two different wavebands. Sensor 1 collected 
data at a faster rate than Sensor 2. Features chosen for the study were the intensities in band 1 and band 2 for each sensor. 
The target data chosen for this study had significant overlap in the intensity space to provide a stressing 
test case with which to perform a meaningful evaluation of the D-S technique. Each sensor collected data on many 
of the objects deployed during the flight test experiment although only four objects were considered for the initial 
study. One object (object 1) was an RV surrogate, two were fairly stressing decoys (objects 2 and 3), and one object
(object 4) represented a non-credible target. It should also be noted that only objects 2 and 3 were observed by 
sensor 2. 


Before the data could be used to test the D-S algorithm, two issues associated with employing non-simulated
data had to be resolved. First, because this was a single flight test, statistical percentages of classification
for a number of test trials could not be obtained. Second, because the frame rates for the two sensors of interest
were different, the data was unbalanced between sensors and uncorrelated in time. As a means of alleviating these
problems, a binning method was developed and implemented. The time of interest for the flight test was divided
into 20 equal segments. The data collected by sensor 1 and sensor 2 was grouped according to these 20 time
segments (bins). The intensity data for each object as collected by each sensor was then averaged in each bin. This
resulted in 20 “test trials” for each object and each sensor. The number of bins was determined based upon the
frame rate and frame pattern of the sensors. While one may argue that this binning method and the choice of 20
test trials is deterministic and statistically questionable, this binning approach seemed to make the best use of
limited data to provide a means of presenting statistical results.
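The binning step can be sketched as follows. This is a minimal illustration under stated assumptions: each sensor supplies time-tagged intensity samples, bins are of equal width, and a bin with no samples yields `None`. The function name, the handling of empty bins, and the sample values are hypothetical.

```python
def bin_average(times, values, t_start, t_end, n_bins=20):
    """Average `values` within n_bins equal-width time segments of [t_start, t_end]."""
    width = (t_end - t_start) / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for t, v in zip(times, values):
        if not (t_start <= t <= t_end):
            continue  # sample lies outside the time of interest
        # A sample exactly at t_end is placed in the last bin.
        idx = min(int((t - t_start) / width), n_bins - 1)
        sums[idx] += v
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical samples binned into two segments for readability.
times = [0.0, 0.4, 0.6, 1.0, 1.9]
values = [1.0, 3.0, 5.0, 7.0, 9.0]
print(bin_average(times, values, t_start=0.0, t_end=2.0, n_bins=2))  # [3.0, 8.0]
```

Applying the same bin edges to both sensors yields one averaged feature value per bin per sensor, regardless of how many frames each sensor collected within the bin.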


4.2 APPLICATION TO BALLISTIC MISSILE TARGET DISCRIMINATION - FEATURE LEVEL FUSION


In order to apply D-S theory to the discrimination problem at the feature level, an a priori database had to
be populated with feature information and a mass function defined to translate these features into masses.
Historically, mass function definition has been accomplished by using probabilistic methods as well as rule based
systems. For this study, a probabilistic approach was chosen since the methods used in this approach are similar
to those used for traditional Bayesian classification. First, a database was populated with the expected feature
values for each object. Due to the lack of prior flight test data available for these objects, the database values were
generated using Monte Carlo target simulations. These simulations utilized pre-flight predictions of the target
characteristics to generate the expected target intensity space that might be observed during the mission.
Corresponding feature values were then calculated for the simulations and loaded into a database that was accessed when
the algorithm was executed. Figure 3a summarizes the steps in this approach.


After the database was populated, the corresponding feature values from the data had to be derived. The
probability that the object in track belonged to each class was calculated by comparing mission feature
measurements to the parameters in the database. For ease of application, a Gaussian distribution was assumed throughout
this initial analysis. After all class probabilities were calculated for a feature, an uncertainty factor was entered for the full
$\Theta$ set. A constant uncertainty value of 40% was utilized for this initial study, although this value was
chosen deterministically to prevent belief and plausibility from converging or crossing before all features were utilized.
After the uncertainty was entered, hypothesis probabilities were normalized to fit mass requirements. This process
is outlined in Figure 3b. A precise method for quantifying this uncertainty will ultimately be required.
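The translation from a feature measurement to masses can be sketched as below. The exact normalization used in this study is not spelled out, so the sketch adopts one plausible reading as an assumption: the Gaussian likelihoods are rescaled to share 60% of the mass and the constant 40% uncertainty is placed on the full set. The function names and all database values are hypothetical.

```python
import math


def normal_pdf(x, mean, std):
    """One dimensional Gaussian density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def feature_to_masses(x, database, theta, uncertainty=0.4):
    """Turn one feature measurement into masses.

    database maps each hypothesis (a frozenset) to an assumed (mean, std).
    Assumed normalization: likelihoods share (1 - uncertainty) of the mass
    and `uncertainty` is assigned to the full set theta.
    """
    likelihoods = {h: normal_pdf(x, mean, std) for h, (mean, std) in database.items()}
    total = sum(likelihoods.values())
    if total == 0.0:
        return {theta: 1.0}  # no hypothesis explains the measurement
    masses = {h: (1.0 - uncertainty) * lik / total for h, lik in likelihoods.items()}
    masses[theta] = masses.get(theta, 0.0) + uncertainty
    return masses


# Hypothetical database entries for one intensity feature.
theta = frozenset({"RV", "stage", "fragment"})
database = {
    frozenset({"RV"}): (10.0, 2.0),
    frozenset({"stage"}): (14.0, 2.0),
    frozenset({"fragment"}): (30.0, 3.0),
    frozenset({"RV", "stage"}): (12.0, 1.0),  # mixed class fitted to an overlap region
}

masses = feature_to_masses(12.1, database, theta)
for hypothesis, mass in sorted(masses.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 3))
```

With these made-up parameters, a measurement of 12.1 falls between the RV and stage means, so the largest share of the committed mass goes to the {RV, stage} hypothesis rather than to either singleton.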


Because D-S theory allows for “mixed class” hypotheses as well as traditional hypotheses, a method for
determining which criteria constitute support for a mixed class had to be developed. Assuming Gaussian data,
example feature distributions of each singleton class of an RV, stage, and fragment are shown in Figure 4. Features
falling in the area where the RV and stage probability distributions overlap would be subject to misclassification in
a traditional Bayesian approach. However, using RV and stage simulations for the features in this overlap area as a
separate sample set, parameters can be calculated and loaded into the database as parameters for the {RV, Stage}
class. If two distributions have no significant overlap, as in the case of the RV and fragment distributions shown in Figure
4, it is reasonable to assume that there is little chance for classification error between the RV and the
fragment for this feature. Therefore, no mixed class hypothesis distribution parameters should be calculated.
Masses for the mixed class hypotheses can be determined in the same manner as the singletons.
Figure 4. Defining Mixed Classes


Examining various hypothesis sets provides a means of evaluating how well mixed classes perform as
defined for this analysis as compared to traditional singleton hypotheses. For fusion at the feature level, three
different hypothesis types were tested: the traditional singleton hypotheses with the additional full $\Theta$ set to handle
uncertainty, the full set of “mixed class” hypotheses allowed in D-S theory, and a third hypothesis set referred to as
the “complement set”. The complement set was conceived as a compromise between the singleton and “mixed
class” sets due to concerns about computational complexity and is comprised of the singleton hypotheses and their
complements. For example, in a four-class problem with an RV, object 1, object 2, and object 3, the complement set
of hypotheses would include the hypotheses listed in Table 1.

Singleton Hypothesis    Complement
{RV}                    {not RV} = {obj 1, obj 2, obj 3}
{obj 1}                 {not obj 1} = {RV, obj 2, obj 3}
{obj 2}                 {not obj 2} = {RV, obj 1, obj 3}
{obj 3}                 {not obj 3} = {RV, obj 1, obj 2}
Uncertainty: $\Theta$ = {RV, obj 1, obj 2, obj 3}

Table 1. Example of Complement Hypothesis Set


Because computational complexity grows exponentially with the number of classes when all mixed class hypotheses are
allowed, complement hypothesis sets could prove to be a suitable compromise between singleton and mixed class hypothesis sets.
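The difference in size among the three hypothesis sets can be made concrete by enumerating them. The sketch below is illustrative; the function names are hypothetical, the full set is counted in all three sets, and at least two classes are assumed.

```python
from itertools import combinations


def singleton_set(classes):
    """Singleton hypotheses plus the full set for uncertainty."""
    return [frozenset({c}) for c in classes] + [frozenset(classes)]


def complement_set(classes):
    """Singletons, their complements, and the full set, without duplicates."""
    theta = frozenset(classes)
    hypotheses = []
    for c in classes:
        for h in (frozenset({c}), theta - {c}):
            if h and h not in hypotheses:
                hypotheses.append(h)
    if theta not in hypotheses:
        hypotheses.append(theta)
    return hypotheses


def mixed_class_set(classes):
    """Every non-empty subset of the frame of discernment."""
    return [frozenset(c) for r in range(1, len(classes) + 1)
            for c in combinations(classes, r)]


classes = ["RV", "obj 1", "obj 2", "obj 3"]
print(len(singleton_set(classes)),    # 5
      len(complement_set(classes)),   # 9
      len(mixed_class_set(classes)))  # 15

eight = [f"class {i}" for i in range(8)]
print(len(singleton_set(eight)), len(complement_set(eight)), len(mixed_class_set(eight)))
```

The singleton and complement sets grow linearly with the number of classes (n + 1 and 2n + 1 hypotheses), while the full mixed class set grows as 2^n - 1.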


After fusing the sensor data by utilizing the D-S techniques described above, a procedure was required to
perform the final classification decision. Preliminary testing to determine which method would provide the best results
proved inconclusive. Based on previous research and the fact that “belief” is by definition what is known to be true
based on given evidence, the belief values were chosen for this study as the classifier output to be used in the
decision making process. Once the decision criterion was selected, it was also necessary to define the final
classification procedure to be utilized to identify the targets. The procedure can be described using the example
shown in Figure 5. Each element in the 4x4 matrix is represented by a row heading (hypotheses) and a column
heading (track file numbers). The first identification is performed by locating the highest value in this matrix and
assigning the associated track file number (i.e., column heading) the value of the hypothesis (i.e., row heading).
The remaining values in this row and column are then eliminated. The next assignment is performed using the
values in the remaining 3x3 matrix. The highest value in this matrix is located and the track file number in the
column heading is assigned the corresponding hypothesis value. This process is repeated until all track files are
assigned a value. Correct identifications are represented by assignments made along the diagonal. Figure 5 shows
that objects 1 and 4 were correctly classified while objects 2 and 3 were misclassified.
Figure 5. Decision Logic
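The assignment procedure described above is a greedy search over the belief matrix. The following sketch illustrates it; the function name is hypothetical and the belief values are invented for the example (they are not the values in Figure 5). Rows are hypotheses, columns are track files, and both are indexed from zero.

```python
def greedy_assign(belief):
    """Repeatedly take the largest remaining belief and remove its row and column.

    belief[i][j] is the belief that track file j is hypothesis i.
    Returns a dict mapping track file index to assigned hypothesis index.
    """
    rows = list(range(len(belief)))
    cols = list(range(len(belief[0])))
    assignment = {}
    while rows and cols:
        # Locate the highest value among the rows and columns not yet eliminated.
        i, j = max(((r, c) for r in rows for c in cols),
                   key=lambda rc: belief[rc[0]][rc[1]])
        assignment[j] = i
        rows.remove(i)
        cols.remove(j)
    return assignment


# Hypothetical 4x4 belief matrix.
belief = [
    [0.50, 0.20, 0.10, 0.05],
    [0.20, 0.25, 0.45, 0.05],
    [0.15, 0.40, 0.30, 0.10],
    [0.05, 0.05, 0.05, 0.70],
]

assignment = greedy_assign(belief)
print(sorted(assignment.items()))  # [(0, 0), (1, 2), (2, 1), (3, 3)]
```

In this made-up example the first and last track files fall on the diagonal and are therefore correct, while the two middle track files are swapped.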


4.3 CLASSIFIER LEVEL FUSION 

The second method chosen to test the D-S based algorithm was to perform fusion of data collected by two
sensors at the classifier level. This was accomplished using the posterior probabilities from (6) that were output
from the quadratic classifier. These probabilities were normalized to become masses for input into the D-S
algorithm and fused using the Combination Rule. Only two classes, lethal and non-lethal, were considered. Unlike
the feature level analysis, zero uncertainty, constant uncertainty, and varying uncertainty between the sensors were
tested. Another difference from the feature level analysis is that a one-to-one match between hypotheses and track
files was not forced. Each object was determined to be lethal or non-lethal based upon which hypothesis had the largest belief
value after fusion.
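Classifier level fusion for the two class case can be sketched as follows. The sketch assumes that each sensor’s posterior for the lethal class is scaled by one minus that sensor’s uncertainty, with the uncertainty placed on the full set; this scaling, the function names, and the posterior values are assumptions made for illustration. In a two class frame the belief of a singleton equals its mass, so the decision compares the two fused singleton masses.

```python
LETHAL = frozenset({"lethal"})
NON_LETHAL = frozenset({"non-lethal"})
THETA = LETHAL | NON_LETHAL


def posterior_to_masses(p_lethal, uncertainty=0.0):
    """Convert a classifier posterior into masses with mass `uncertainty` on THETA."""
    return {LETHAL: (1.0 - uncertainty) * p_lethal,
            NON_LETHAL: (1.0 - uncertainty) * (1.0 - p_lethal),
            THETA: uncertainty}


def combine(m1, m2):
    """Dempster's Rule of Combination, equations (2) and (3)."""
    products, conflict = {}, 0.0
    for b, mass_b in m1.items():
        for c, mass_c in m2.items():
            if b & c:
                products[b & c] = products.get(b & c, 0.0) + mass_b * mass_c
            else:
                conflict += mass_b * mass_c
    if 1.0 - conflict < 1e-12:
        raise ValueError("K = 1: the sources are in total conflict")
    return {a: v / (1.0 - conflict) for a, v in products.items()}


def classify(p1, p2, u1=0.0, u2=0.0):
    """Fuse two posteriors and pick the hypothesis with the larger belief."""
    fused = combine(posterior_to_masses(p1, u1), posterior_to_masses(p2, u2))
    # A tie is arbitrarily resolved as non-lethal in this sketch.
    return "lethal" if fused.get(LETHAL, 0.0) > fused.get(NON_LETHAL, 0.0) else "non-lethal"


# Hypothetical posteriors P(lethal | X) from sensor 1 and sensor 2.
print(classify(0.7, 0.2))                  # zero uncertainty
print(classify(0.7, 0.2, u1=0.4, u2=0.4))  # constant uncertainty
print(classify(0.7, 0.2, u1=0.0, u2=0.5))  # varying uncertainty
```

With zero uncertainty the fused lethal mass reduces to the normalized product of the two posteriors, and in this sketch an equal uncertainty on both sensors leaves the decision unchanged. Discounting only sensor 2 lets sensor 1 dominate, so the decision for these example values changes from non-lethal to lethal.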


5.0 ANALYSIS OF RESULTS 


D-S algorithm results using sensor 1 data are seen in Figure 6a. There was little change in performance amongst 
the three hypothesis sets tested. When sensor 2 data was included in the fusion, Figure 6b, performance increased for all 
hypothesis sets tested. However, results using the mixed class hypothesis set are not as promising as those using
the singleton and complement sets. Also note that object 4 was correctly identified in all cases.
(a) Sensor 1 Data Only
(b) Sensor 1 Data Fused with Sensor 2 Data



Figure 6. Feature Level Dempster-Shafer Fusion Results 


The results using the quadratic classifier are shown in Figure 7. Note that the quadratic classifier results
for sensor 1 alone, Figure 7a, showed improved performance over the D-S algorithm for both sensors combined.
Figure 7b presents the results obtained with features from sensor 2 alone. The results of testing feature vectors containing
features derived from data collected by both sensors are shown in Figure 7c. Here the performance decreased marginally
in identifying the lethal target. This could be explained by the absence of data from the
lethal target by sensor 2. To represent this lack of data, zero values were input into the feature vectors for sensor 2
data on object 1. For the analogous D-S case, a 100% uncertainty value was entered for sensor 2 derived features
for object 1 and no decrease in performance occurred. Although the statistical relevancy of these results is
questionable, this seems to support the flexibility of the D-S algorithm and a benefit of the explicit uncertainty
representation contained therein.

(a) Sensor 1 Quadratic Classifier Results

                     TRUTH
CLASSIFICATION    OBJ #1    OBJ #2    OBJ #3    OBJ #4
OBJ #1            55%       45%       45%       5%
OBJ #2            25%       40%       25%       0%
OBJ #3            20%       15%       25%       0%
OBJ #4            0%        0%        0%        100%

(b) Sensor 2 Quadratic Classifier Results (-- indicates no data)

                     TRUTH
CLASSIFICATION    OBJ #1    OBJ #2    OBJ #3    OBJ #4
OBJ #1            --        30%       20%       --
OBJ #2            --        25%       60%       --
OBJ #3            --        35%       10%       --
OBJ #4            --        10%       10%       --

(c) Fusion - Quadratic Classifier

                     TRUTH
CLASSIFICATION    OBJ #1    OBJ #2    OBJ #3    OBJ #4
OBJ #1            50%       25%       10%       15%
OBJ #2            30%       60%       50%       0%
OBJ #3            0%        10%       25%       0%
OBJ #4            20%       5%        15%       85%

Figure 7. Feature Level Quadratic Classifier Fusion Results


One result of the analysis concerns the effectiveness of the complement set of hypotheses. This case
performed similarly to the analogous singleton case and better than the mixed class cases. Because the complement
set of hypotheses operates more efficiently than the full mixed class set, this result could prove helpful as more
experimentation is done in defining and assigning mass to mixed class hypotheses, should this method of testing seem appropriate.


Results from the D-S algorithm also suggest that the mixed class hypothesis set did not perform as well as 
the singleton and complement sets. One reason for this may be explained through “mixed class dilution.” This 
phenomenon refers to the “watering down” of masses as the mass values are spread over an increasing number of 
hypotheses. Because the sum of the masses must equal 1, each hypothesis mass must assume smaller proportions as 
the number of hypotheses grows. Hypotheses supporting correct classification thus have less power in affecting the
final decision when a full set of mixed class hypotheses is possible. The effects of dilution are illustrated in Figure 8. The 
figure represents a four class problem whose hypothesis sets are labeled across the horizontal axis. The 
vertical axis depicts example mass measurements. As can be seen from the figure, masses obtained for the mixed 
class hypothesis set show very little variation for the fourteen hypotheses. With the decrease in the number of 
hypotheses for the complement case, more variation in the mass values is evident. Finally, the most variation in 
mass values is seen in the four singleton hypotheses. Logic that assigns values to a mixed class hypothesis only 
when the masses for the singleton hypotheses comprising the mixed class vary slightly potentially could be a means 
of improving performance in the mixed class case. 
Figure 8. Mass Dilution Example


Figure 9 highlights the results of the classifier level fusion analysis for the quadratic classifier and D-S 
technique. The quadratic classifier results for sensor 1 are shown in Figure 9a for lethal and non-lethal classes 
only. Similarly, Figure 9b presents the results of the quadratic classifier for sensor 2 only. Fusion performance 
using the D-S technique with zero uncertainty and with constant uncertainty was identical to the Bayesian 
technique employed. These results are shown in Figure 9c. Here, it is apparent that fusion at the classifier level 
yielded improved results over the feature level fusion for both the D-S and Bayes classifiers. As indicated in 
Figure 9d, the percentage of non-lethal targets classified as lethal targets increased marginally when the uncertainty
was varied between the sensors for the D-S algorithm. However, the percentage of lethal targets classified correctly 
did not decrease. 
Figure 9. Classifier Level Fusion Results


While overall discrimination performance for both the D-S and Bayes classifiers was low for this
stressing data set, observations concerning the nature of D-S algorithm performance could be made and
areas for additional work identified.

6.0 CONCLUSIONS 


A Dempster-Shafer algorithm was developed, implemented, and tested using flight test data observed by 
two infrared sensors. Performance between the D-S methods presented and traditional Bayesian techniques was 
compared. Preliminary results indicate that sensor fusion results at the classifier level may be comparable to those 
of traditional classifiers when the basic concepts of D-S theory are implemented. It also appears that fusion at the
classifier level produces better results than fusion at the feature level for both traditional and D-S classifiers. Results
also support the possibility of using the complement hypothesis set to reduce computational complexity in real-time 
applications. Clearly, these results would benefit from validation provided by a sensitivity analysis employing 
Monte Carlo testing of synthetic data. 


7.0 FUTURE WORK 


There are several areas to be addressed in future work. The approach for defining the mass function is 
based on a Gaussian distribution and heavy emphasis is placed on using simulations. These are strong assumptions 
that may deviate significantly from real-world observations. While the mixed class hypotheses can provide 
flexibility, it has been observed that feature masses supporting correct hypotheses can be diluted when mixed 
classes are defined using the methods presented here. Additional work is also required to arrive at the most robust criteria
for decision making for each application. An examination of processing limitations must be initiated to determine if a D-S
algorithm can be successfully implemented because computational complexity increases with
each additional class. Finally, a procedure also needs to be developed to define the independent uncertainty used in assigning
mass to the full $\Theta$ hypothesis. This explicit uncertainty quantification sets D-S methods apart from traditional
approaches and could be both the primary benefit of, and the primary issue in, applying Dempster-Shafer methods to solve
the problems presented in BMD scenarios. Using simulated scenarios for additional testing could be the key to
solving these problems. Such a simulated scenario might include multiple targets ranging from RVs and stressing
decoys to boosters and debris. Three sensors operating at high frame rates to derive temporal features could 
generate data on these objects. Such a scenario would allow for variation among the sensors’ signal to noise ratios 
as well as the number of objects observed by each sensor. Detailed sensitivity studies regarding the amount of 
uncertainty attributed to each sensor, feature, or object would be possible. Monte Carlo testing of the D-S 
algorithm for such scenarios would lend statistical significance to all observations. 